Reinforcement learning with modulated spike timing dependent synaptic plasticity.

Authors

  • Michael A Farries
  • Adrienne L Fairhall
Abstract

Spike timing-dependent synaptic plasticity (STDP) has emerged as the preferred framework linking patterns of pre- and postsynaptic activity to changes in synaptic strength. Although synaptic plasticity is widely believed to be a major component of learning, it is unclear how STDP itself could serve as a mechanism for general purpose learning. On the other hand, algorithms for reinforcement learning work on a wide variety of problems, but lack an experimentally established neural implementation. Here, we combine these paradigms in a novel model in which a modified version of STDP achieves reinforcement learning. We build this model in stages, identifying a minimal set of conditions needed to make it work. Using a performance-modulated modification of STDP in a two-layer feedforward network, we can train output neurons to generate arbitrarily selected spike trains or population responses. Furthermore, a given network can learn distinct responses to several different input patterns. We also describe in detail how this model might be implemented biologically. Thus our model offers a novel and biologically plausible implementation of reinforcement learning that is capable of training a neural population to produce a very wide range of possible mappings between synaptic input and spiking output.
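The general idea of a performance-modulated STDP rule can be illustrated with a minimal sketch. This is not the authors' model: the abstract does not give the rule's form, so the exponential pairwise window, the constants (`a_plus`, `a_minus`, `tau_plus`, `tau_minus`), the `baseline` term, and the function names are all assumptions chosen for illustration. The sketch only shows how a scalar performance signal might gate the sign and size of an STDP-driven weight change at a single synapse.

```python
import math


def stdp_window(dt_ms, a_plus=0.01, a_minus=0.012, tau_plus=20.0, tau_minus=20.0):
    """Hypothetical pairwise STDP kernel, with dt_ms = t_post - t_pre.

    Pre-before-post (dt_ms > 0) gives potentiation, post-before-pre gives
    depression. All constants are illustrative assumptions.
    """
    if dt_ms > 0:
        return a_plus * math.exp(-dt_ms / tau_plus)
    if dt_ms < 0:
        return -a_minus * math.exp(dt_ms / tau_minus)
    return 0.0


def modulated_update(weight, pre_spikes, post_spikes, reward,
                     baseline=0.0, w_min=0.0, w_max=1.0):
    """Scale the summed STDP contribution by (reward - baseline).

    The sum over all spike pairs acts as an eligibility term; the
    performance signal decides whether it is applied, reversed, or ignored.
    """
    eligibility = sum(
        stdp_window(t_post - t_pre)
        for t_pre in pre_spikes
        for t_post in post_spikes
    )
    new_weight = weight + (reward - baseline) * eligibility
    # Keep the weight inside assumed hard bounds.
    return min(w_max, max(w_min, new_weight))


if __name__ == "__main__":
    pre, post = [10.0], [15.0]  # spike times in ms; pre leads post by 5 ms
    print(modulated_update(0.5, pre, post, reward=1.0))   # above-baseline performance
    print(modulated_update(0.5, pre, post, reward=-1.0))  # below-baseline performance
```

With a pre-before-post pairing, an above-baseline signal strengthens the synapse, a below-baseline signal weakens it, and a signal equal to the baseline leaves the weight unchanged.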

Similar resources

Reinforcement Learning Through Modulation of Spike-Timing-Dependent Synaptic Plasticity

The persistent modification of synaptic efficacy as a function of the relative timing of pre- and postsynaptic spikes is a phenomenon known as spike-timing-dependent plasticity (STDP). Here we show that the modulation of STDP by a global reward signal leads to reinforcement learning. We first derive analytically learning rules involving reward-modulated spike-timing-dependent synaptic and intri...

Reinforcement Learning with Modulated Spike Timing-Dependent Synaptic Plasticity (running head: Reinforcement Learning with STDP)

Spike timing-dependent synaptic plasticity (STDP) has emerged as the preferred framework linking patterns of pre- and postsynaptic activity to changes in synaptic strength. Although synaptic plasticity is widely believed to be a major component of learning, it is unclear how STDP itself could serve as a mechanism for general purpose learning. On the other hand, algorithms for reinforcement learn...

Spike timing dependent plasticity: mechanisms, significance, and controversies

Long-term modification of synaptic strength is one of the basic mechanisms of memory formation and activity-dependent refinement of neural circuits. This idea was proposed by Hebb to provide a basis for the formation of a cell assembly. Repetitive correlated activity of presynaptic and postsynaptic neurons can induce long-lasting synaptic strength modification, the direction and extent of whi...

Optimal Spike-Timing Dependent Plasticity for Precise Action Potential Firing in Supervised Learning

In timing-based neural codes, neurons have to emit action potentials at precise moments in time. We use a supervised learning paradigm to derive a synaptic update rule that optimizes via gradient ascent the likelihood of postsynaptic firing at one or several desired firing times. We find that the optimal strategy of up and down regulating synaptic efficacies depends on the relative timing betwe...

Journal title:
  • Journal of Neurophysiology

Volume 98, Issue 6

Publication date: 2007